Abstract: The Compressive Sensing framework maintains
relevance even when the available measurements are subject to
extreme quantization, as is exemplified by the so-called
one-bit compressed sensing framework, which aims to recover a
signal from measurements reduced to only their sign bit. In
applications, it is often the case that we have some knowledge
of the structure of the signal beforehand, and thus would like
to leverage it to attain more accurate and efficient recovery.
This work explores avenues for incorporating such
partial-support information into the one-bit setting.
Experimental results demonstrate that the newly proposed
methods of this work yield improved signal recovery even for
varying levels of accuracy in the prior information. This work
is thus the first to provide recovery mechanisms that
efficiently use prior signal information in the one-bit
reconstruction setting.

I. Introduction 

Compressed Sensing (CS) addresses the problem of accurately
acquiring high-dimensional signals from a set of relatively
few linear measurements. The problem can be formulated
mathematically via the system $y = \Phi x$, where
$\Phi \in \mathbb{C}^{m \times n}$ is the measurement matrix.
In the compressed setting, $m \ll n$, but one utilizes the
assumption that the signal $x$ possesses some additional
structure, such as sparsity. We say that $x$ is $k$-sparse when

$$\|x\|_0 := |\mathrm{supp}(x)| = k \ll n.$$

CS has seen a vast amount of progress, and it is now well
known that for suitable matrices $\Phi$ (for example i.i.d.
Gaussian), any $k$-sparse vector $x$ can be recovered from
$y \in \mathbb{C}^m$ when $m \approx k \log(n/k)$.

Unfortunately, the majority of theoretical work in CS assumes
that the measurements are acquired with infinite precision,
whereas in practice they must be quantized. The extreme
quantization setting where only the sign bit is acquired is
known as one-bit compressed sensing. In this framework, the
measurements take the form
$y_i = \mathrm{sign}(\langle x, \phi_i \rangle)$, where
$\phi_i$ denotes the $i$th row of the measurement matrix
$\Phi$. Typically one then loses the ability to recover the
magnitude of $x$ and thus assumes the signal has a fixed norm
(e.g. unit norm), although there are adaptive techniques to
overcome this as well.
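To make the measurement model concrete, the following is a minimal sketch of how one-bit measurements could be simulated. It assumes real-valued signals and a Gaussian measurement matrix; the helper names `sparse_unit_signal` and `one_bit_measure`, the seed, and the sizes are illustrative choices, not part of the original work.

```python
import numpy as np

def sparse_unit_signal(n, k, rng):
    """Draw a k-sparse signal in R^n with uniformly random support and unit norm."""
    x = np.zeros(n)
    support = rng.choice(n, size=k, replace=False)
    x[support] = rng.standard_normal(k)
    return x / np.linalg.norm(x)

def one_bit_measure(Phi, x):
    """Keep only the sign of each linear measurement <x, phi_i>."""
    return np.sign(Phi @ x)

rng = np.random.default_rng(0)   # illustrative seed
n, m, k = 256, 128, 8            # illustrative sizes
x = sparse_unit_signal(n, k, rng)
Phi = rng.standard_normal((m, n))
y = one_bit_measure(Phi, x)

# Scaling x by a positive constant leaves every sign unchanged, which is
# why the magnitude of x cannot be recovered from y alone.
same_signs = np.array_equal(y, one_bit_measure(Phi, 2.0 * x))
```

Here `same_signs` evaluates to true, which illustrates why a fixed-norm assumption on the signal is typically made.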

A. Existing one-bit methods 

Although one-bit CS is a relatively new technology, efficient
recovery algorithms have been studied. There are mainly two
types of methods: those based on linear programming and those
based on iterative approaches. In this work we focus on the
iterative approach, and in particular on the Binary Iterative
Hard Thresholding (BIHT) algorithm, an extension of the
traditional Iterative Hard Thresholding (IHT) algorithm.
Assuming that our desired signal is $k$-sparse, the objective
of BIHT is to return a solution that is $k$-sparse and
consistent with the given sign measurements. Viewed as solving
an optimization problem, at each iteration BIHT takes a
gradient step to attain a new approximation. This
approximation is then thresholded to retain only the $k$
largest-magnitude entries. Finally, after a consistent
approximation is attained or enough iterations have elapsed,
the estimate is normalized and returned. Algorithm 1 presents
a more detailed description of BIHT. Here and throughout we
use the thresholding function prune($z$, $k$), which returns
the vector $z$ with all but the $k$ largest-magnitude entries
set to zero.


Algorithm 1 Binary Iterative Hard Thresholding (BIHT).
Given: measurement matrix $\Phi$, one-bit measurements $y$,
assumed sparsity level $k$, gradient step-size $\tau$

1: procedure BIHT($\Phi$, $y$, $k$)
2:   $x = 0$   ▷ Initialize trivial approximation of $x$
3:   repeat
4:     $\Gamma = x + \frac{\tau}{2} \Phi^T (y - \mathrm{sign}(\Phi x))$   ▷ Gradient step
5:     $x = \mathrm{prune}(\Gamma, k)$   ▷ Hard threshold
6:   until halting criterion satisfied
7:   return $x / \|x\|_2$   ▷ Normalize
8: end procedure
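The steps of Algorithm 1 can be sketched in NumPy as follows. This is a hedged illustration rather than the authors' implementation: it assumes real-valued data, takes the step as $\frac{\tau}{2}\Phi^T(y - \mathrm{sign}(\Phi x))$, and uses a halting rule (small change in the estimate, or an iteration cap) chosen for this sketch. The names `biht`, `max_iter`, and `tol` are assumptions.

```python
import numpy as np

def prune(z, k):
    """Return z with all but its k largest-magnitude entries set to zero."""
    out = np.zeros_like(z)
    if k <= 0:
        return out
    keep = np.argsort(np.abs(z))[-k:]
    out[keep] = z[keep]
    return out

def biht(Phi, y, k, tau=0.001, max_iter=1000, tol=1e-10):
    """Sketch of BIHT for real-valued Phi and y in {-1, +1}^m."""
    x = np.zeros(Phi.shape[1])
    for _ in range(max_iter):
        gamma = x + 0.5 * tau * Phi.T @ (y - np.sign(Phi @ x))  # gradient step
        x_new = prune(gamma, k)                                  # hard threshold
        converged = np.linalg.norm(x_new - x) < tol
        x = x_new
        if converged:
            break
    norm = np.linalg.norm(x)
    return x / norm if norm > 0 else x                           # normalize

# Small demonstration on synthetic data (sizes and seed are illustrative).
rng = np.random.default_rng(0)
n, m, k = 256, 512, 8
x_true = np.zeros(n)
x_true[rng.choice(n, size=k, replace=False)] = rng.standard_normal(k)
x_true /= np.linalg.norm(x_true)
Phi = rng.standard_normal((m, n))
y = np.sign(Phi @ x_true)

x_hat = biht(Phi, y, k)
print("correlation with true signal:", float(x_hat @ x_true))
```

Because both the true signal and the estimate have unit norm, the printed inner product is a simple similarity score between them.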



II. Methods using partial support 
In many applications, it is not only known that the signal
of interest is sparse, but additional information about the
support of the signal may also be known. For example, it
is well known that the support of wavelet representations
of natural images largely resides in the low frequencies. In
distributed settings, the signals of interest may be highly
correlated so that partial support information may be obtained
from neighboring atoms. This framework can be modeled by
assuming that some partial estimate $\tilde{T}$ of the signal
support $T := \mathrm{supp}(x)$ is available a priori. The
work of Mansour et al. extends the conventional
$\ell_1$-minimization method to a weighted $\ell_1$ approach
that effectively incorporates such a support estimate. To
incorporate the estimate $\tilde{T}$, consider the following
program:

$$\min_x \sum_{i=1}^{n} w_i |x_i| \quad \text{s.t.} \quad y = \Phi x, \qquad \text{with} \quad w_i = \begin{cases} c \in [0,1] & \text{if } i \in \tilde{T}, \\ 1 & \text{if } i \notin \tilde{T}. \end{cases}$$

This program penalizes non-zero entries in locations outside
the support estimate $\tilde{T}$ more heavily than those
inside it. The value $c$ in the weight vector $w$ can be
determined by the confidence in each element of $\tilde{T}$.
We build upon this notion of weighting the estimation vector
by developing analogous methods for the one-bit setting,
focusing on iterative methods.
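For readers who want to experiment with the weighted $\ell_1$ program itself (in the unquantized setting), one possible sketch recasts it as a linear program by splitting $x = u - v$ with $u, v \ge 0$ and solving it with SciPy's `linprog`. This is an illustration under the assumption of real-valued data, not the solver used in the cited work; the function name `weighted_l1` and the problem sizes are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

def weighted_l1(Phi, y, support_est, c):
    """Solve min sum_i w_i |x_i| s.t. Phi x = y, with w_i = c on support_est, 1 elsewhere."""
    m, n = Phi.shape
    w = np.ones(n)
    w[np.asarray(list(support_est), dtype=int)] = c
    cost = np.concatenate([w, w])      # objective: sum_i w_i (u_i + v_i)
    A_eq = np.hstack([Phi, -Phi])      # constraint: Phi (u - v) = y
    res = linprog(cost, A_eq=A_eq, b_eq=y, bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:n] - res.x[n:]

# Illustrative problem: linear (not one-bit) measurements of a sparse signal.
rng = np.random.default_rng(0)
m, n, k = 30, 60, 4
support = rng.choice(n, size=k, replace=False)
x_true = np.zeros(n)
x_true[support] = rng.standard_normal(k)
Phi = rng.standard_normal((m, n))
y = Phi @ x_true

x_rec = weighted_l1(Phi, y, support, c=0.1)   # trusted support gets a small weight
```

Smaller values of `c` express more confidence in the support estimate; `c = 1` reduces to ordinary $\ell_1$-minimization.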


A. Oracle estimation 


As a first, most basic approach, let us assume that our
support estimate $\tilde{T}$ is completely accurate, i.e.
$\tilde{T} = T$. At the pruning step, no matter our result, we
could simply set all entries of $\Gamma$ not in $\tilde{T}$ to
zero; instead of locating the $k$ largest entries of $\Gamma$,
we would naively retain only the entries of $\Gamma$ indexed by
$\tilde{T}$. The entries of our new estimate would be
determined as follows:

$$x_i = \begin{cases} \Gamma_i & \text{if } i \in \tilde{T}, \\ 0 & \text{if } i \in \tilde{T}^c. \end{cases}$$

[Figure 1: Standard BIHT vs. BIHT with Full Support Known, Soft and Hard Thresholding]

Fig. 1. Performance of BIHT when the support estimate is exact and we employ
soft and hard thresholding (c = 0).


As a similar approach, we could instead try soft-thresholding
the entries of $\Gamma$ not in $\tilde{T}$, multiplying them
by some constant $0 < c < 1$. Figure 1 shows the result of
this approach; here and throughout, unless otherwise noted,
the signal length is $n = 256$, the sparsity level is $k = 8$,
$\Phi$ has standard normal entries, $\tau = 0.001$, the
support of the signal $x$ is distributed uniformly at random,
the magnitudes of the non-zero entries are standard normal,
and the algorithm is run until the estimate changes by less
than $10^{-10}$ or after 1000 iterations. Figure 1 shows the
mean squared error (MSE) averaged over 100 trials for various
values of $m$, the number of measurements. At each iteration,
entries of $\Gamma$ that are not in the support estimate are
scaled down by a factor $c$, so that when $c = 0$ the result
is simple hard thresholding. Unsurprisingly, a perfect support
estimate is very powerful under this bold hard-thresholding
strategy. Figure 1 also shows that soft thresholding with the
perfect support estimate performs identically to hard
thresholding. This result is not hard to understand: after $j$
iterations the elements not in $\tilde{T}$ have been scaled
down by $c^j$, which approaches zero after many iterations. Of
course, knowing the full support beforehand is not likely, but
these examples show that this information can seriously
expedite the recovery of $x$.

B. Soft thresholding

In practice our support estimate will seldom equal the true
support, and thus at a given iteration we should consider both
$\tilde{T}$ and the locations of the $k$ greatest-in-magnitude
entries of $\Gamma$, denoted $\hat{T}$, when updating our
estimate. Let us assume that we know the support estimate to
be $(p \times 100)\%$ accurate. Now, in the pruning step of
BIHT the entries of $\Gamma$ may be thresholded according to
four distinct possibilities:

$$x_i = \Gamma_i w_i \quad \text{where} \quad w_i = \begin{cases} 1 & \text{if } i \in \tilde{T} \cap \hat{T}, \\ 1 & \text{if } i \in \tilde{T} \cap \hat{T}^c, \\ 1 - p & \text{if } i \in \tilde{T}^c \cap \hat{T}, \\ 0 & \text{if } i \in \tilde{T}^c \cap \hat{T}^c. \end{cases}$$
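The four-case rule for this pruning step can be sketched as follows. Entries on the support estimate keep weight 1, entries among the $k$ largest of $\Gamma$ but off the estimate get weight $1 - p$, and everything else is zeroed. The function name `four_set_threshold` and the toy vector are illustrative assumptions.

```python
import numpy as np

def four_set_threshold(gamma, support_est, k, p):
    """Apply the 4-set weights to gamma and return the thresholded vector."""
    n = gamma.size
    in_est = np.zeros(n, dtype=bool)
    in_est[np.asarray(list(support_est), dtype=int)] = True
    in_topk = np.zeros(n, dtype=bool)
    in_topk[np.argsort(np.abs(gamma))[-k:]] = True   # k largest-magnitude entries

    w = np.zeros(n)            # default: outside both sets
    w[in_topk] = 1.0 - p       # in the top-k set (overwritten below if also in the estimate)
    w[in_est] = 1.0            # in the support estimate, whether or not in the top-k set
    return gamma * w

gamma = np.array([0.9, -0.2, 0.05, -1.3, 0.4])
x_new = four_set_threshold(gamma, support_est=[0, 1], k=2, p=0.75)
# Entries 0 and 1 are kept as is, entry 3 (top-k, off the estimate) is scaled
# by 0.25, and entries 2 and 4 are set to zero.
```

In this toy case the result has three non-zero entries even though `k = 2`, so the output of this rule need not be $k$-sparse.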


Additionally, we can use this 4-set framework when our
support estimate includes erroneous elements. Suppose we
have a support estimate that contains $p \times k$ correct
elements but also includes $(1 - p) \times k$ incorrect
elements. Figure 2 shows BIHT's performance when using this
4-set representation to incorporate prior support
information. In the case of no false positives, when no
elements of our support estimate are incorrect, BIHT performs
reasonably well: as $p$ increases the MSE decays. Similarly,
with the inclusion of false positives, performance is
intuitive: improvements are seen when there are more correct
estimates than incorrect estimates, i.e. $p > 0.5$; for
smaller values of $p$ the support estimate consists mostly of
incorrect elements and performance is worse than standard
BIHT. Also note that under this 4-set representation a
$k$-sparse approximation is not necessarily returned. If
$\tilde{T} \cap \hat{T} = \emptyset$, where $\hat{T}$ indexes
the $k$ largest-magnitude entries of $\Gamma$, then in fact a
$2k$-sparse solution is returned, which will certainly result
in a less accurate solution. Perhaps this is the reason for
the slower rate of error decay in comparison with standard
BIHT, as seen in Figure 2.
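The two kinds of support estimate used in these experiments could be generated as in the sketch below: $pk$ indices drawn from the true support, optionally padded with $(1 - p)k$ indices drawn from outside it. The function name `make_support_estimate`, the rounding of $pk$ to an integer, and the example support are assumptions of this sketch.

```python
import numpy as np

def make_support_estimate(true_support, n, p, with_false_positives, rng):
    """Return p*k correct indices, plus (1-p)*k incorrect ones if requested."""
    true_support = np.asarray(true_support)
    k = true_support.size
    n_correct = int(round(p * k))
    correct = rng.choice(true_support, size=n_correct, replace=False)
    if not with_false_positives:
        return np.sort(correct)
    off_support = np.setdiff1d(np.arange(n), true_support)
    wrong = rng.choice(off_support, size=k - n_correct, replace=False)
    return np.sort(np.concatenate([correct, wrong]))

rng = np.random.default_rng(1)   # illustrative seed
true_support = np.array([3, 17, 42, 99, 120, 200, 201, 250])
est_clean = make_support_estimate(true_support, 256, 0.75, False, rng)
est_fp = make_support_estimate(true_support, 256, 0.75, True, rng)
```

With $k = 8$ and $p = 0.75$, `est_clean` holds 6 correct indices, while `est_fp` holds 6 correct and 2 incorrect indices.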

C. Supervised weighting 

As a means for incorporating a partial support estimate
and returning a $k$-sparse solution, we may use the weighting
framework for traditional $\ell_1$-minimization described
earlier. We refer to this model as supervised, since the
partial support estimate $\tilde{T}$ must be obtained a
priori, external to the algorithm itself. Again, suppose we
believe our estimate $\tilde{T}$ to be $(p \times 100)\%$
accurate. Let us create a weight vector $w$ where

$$w_i = \begin{cases} 1 & \text{if } i \in \tilde{T}, \\ 1 - p & \text{if } i \in \tilde{T}^c. \end{cases}$$






[Figure 2, panel (a): BIHT-PS, 4 set representation, No False Positives]

Fig. 2. Performance of BIHT using soft thresholding. In (a) we know the
estimate to contain $pk$ correct elements (no false positives); in (b) there
are an additional $(1 - p)k$ incorrect elements.


Then, at some iteration of BIHT, once we compute $\Gamma$ let
us multiply $\Gamma$ and $w$ component-wise to attain
$\psi = \Gamma \odot w$. Now we locate the $k$ largest
elements of $\psi$, denoted $\Omega := \mathrm{prune}(\psi, k)$,
and hard threshold the elements of $\Gamma$ indexed by
$\Omega^c$, i.e. set them equal to zero. This approach is very
similar to BIHT when there is no prior support estimate,
except that here $\Gamma$ is instead pruned to retain the
entries at the locations of the $k$ greatest-in-magnitude
elements of $\Gamma \odot w$. A more thorough breakdown of
this procedure is presented in Algorithm 2. We see that this
supervised weighting approach outperforms the 4-set soft
thresholding formulation both with and without false
positives. Figure 3 shows that for every value of $m$, when
some prior information exists, the weighting approach performs
the same as or better than standard BIHT.
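The weighted pruning procedure described above might be implemented along the following lines. As with any sketch, several details are assumptions: real-valued data, a step of $\frac{\tau}{2}\Phi^T(y - \mathrm{sign}(\Phi x))$, and a halting rule based on the change in the estimate. The names `biht_psw`, `max_iter`, and `tol` are illustrative.

```python
import numpy as np

def biht_psw(Phi, y, k, support_est, p, tau=0.001, max_iter=1000, tol=1e-10):
    """Sketch of BIHT with partial-support weighting for real-valued data."""
    n = Phi.shape[1]
    w = np.full(n, 1.0 - p)                               # weight off the estimate
    w[np.asarray(list(support_est), dtype=int)] = 1.0     # weight on the estimate
    x = np.zeros(n)
    for _ in range(max_iter):
        gamma = x + 0.5 * tau * Phi.T @ (y - np.sign(Phi @ x))  # gradient step
        omega = np.argsort(np.abs(gamma * w))[-k:]  # k largest weighted entries
        x_new = np.zeros(n)
        x_new[omega] = gamma[omega]                 # keep unweighted values there
        converged = np.linalg.norm(x_new - x) < tol
        x = x_new
        if converged:
            break
    norm = np.linalg.norm(x)
    return x / norm if norm > 0 else x

# Illustrative run: 6 of the 8 true support indices are known (p = 0.75).
rng = np.random.default_rng(0)
n, m, k = 256, 512, 8
support = rng.choice(n, size=k, replace=False)
x_true = np.zeros(n)
x_true[support] = rng.standard_normal(k)
x_true /= np.linalg.norm(x_true)
Phi = rng.standard_normal((m, n))
y = np.sign(Phi @ x_true)

x_hat = biht_psw(Phi, y, k, support[:6], p=0.75)
```

Note that the weights are only used to choose which locations survive; the retained entries take their unweighted values from $\Gamma$, so the output is always at most $k$-sparse.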

This weighting framework for incorporating partial support
information into BIHT (BIHT-PSW) performs well in the
current context. However, we are assuming that the value for
$p$ is correct. This is a very bold assumption that, in
practice, is not likely to hold. In the weighting step, the
value of $p$ determines how diminished the magnitude of an
entry off the support estimate will be. If we are more
confident in certain entries being non-zero, then other
entries will be scaled by a constant closer to zero. Empirical
results, as displayed in Figure 4, show that using the correct
value of $p$ is crucial to the incorporation of a partial
support estimate.

[Figure 3, panel (a): BIHT-PS, Supervised Weighting, No False Positives; panel (b): BIHT-PS, Supervised Weighting, Includes False Positives]

Fig. 3. Performance of BIHT when using supervised weighting. In (a) we
know the estimate to contain $pk$ correct elements (no false positives); in (b)
there are an additional $(1 - p)k$ incorrect elements.

[Figure 4: BIHT-PS, Supervised Weighting, Incorrect p]

Fig. 4. Effect of inaccurate choice of $p$ in supervised weighting. For incorrect
$p$ (cyan), $p = 0.1$ (rather than $p = 0.9$) was used in the weight vector.

D. Unsupervised re-weighting 

The improvements of this approach prompt one to ask
whether such a result can be leveraged even when no support
estimate is available. If we take $m$ measurements of $x$ and
use BIHT to attain some estimate $\hat{x}$, then we may use
the support of $\hat{x}$, denoted $\tilde{T}$, as an estimate
for $T$. Then, we could use $\tilde{T}$ to run BIHT-PSW,
getting a more accurate approximation, and so on. This is
reminiscent of the re-weighted $\ell_1$-minimization approach
in classical compressed sensing, and is the general idea
behind the BIHT Unsupervised Re-weighting (BIHT-URW)
algorithm, presented as Algorithm 3. We refer to this model as
unsupervised, since no outside information about the support
is required. In this case, since $p$ is certainly unknown, we
utilize a parameter $\lambda$ in place of $p$. The performance
of BIHT-URW is displayed in Figure 5. Unfortunately, we
observe no improvement over standard BIHT; we conjecture this
is perhaps due to the arbitrary selection of $\lambda$ (and
thus $p$) within the method, which we have set to
$\lambda = 0.5$ in Figure 5. As a benchmark for optimal
performance of BIHT-URW we tested the method using a weight
vector built from the actual signal $x$ (rather than its
approximation), and then ran the algorithm as usual. This
resulted in a significant improvement in performance, as shown
in the cyan curve of Figure 5. Of course, this method is not
applicable in practice, but it demonstrates the potential of
such an approach if better ways of estimating $\lambda$ can be
obtained, possibly adapting from iteration to iteration.


Algorithm 2 Binary Iterative Hard Thresholding with Partial
Support Estimate Weighting (BIHT-PSW). Given: measurement
matrix $\Phi$, one-bit measurements $y$, sparsity level $k$,
support estimate $\tilde{T}$, accuracy of support estimate
$p$, step-size $\tau$

1: procedure BIHT-PSW($\Phi$, $y$, $k$, $\tilde{T}$, $p$)
2:   $w_i = 1$ if $i \in \tilde{T}$, $w_i = 1 - p$ if $i \in \tilde{T}^c$   ▷ Construct weights
3:   $x = 0$   ▷ Initialize trivial approximation of $x$
4:   repeat
5:     $\Gamma = x + \frac{\tau}{2} \Phi^T (y - \mathrm{sign}(\Phi x))$   ▷ Gradient step
6:     $\Omega = \mathrm{prune}(\Gamma \odot w, k)$   ▷ Prune weighted update
7:     $x_j = \Gamma_j$ if $j \in \Omega$, $x_j = 0$ if $j \in \Omega^c$   ▷ Update approximation
8:   until halting criterion is satisfied
9:   return $x / \|x\|_2$   ▷ Normalize
10: end procedure




Fig. 5. Performance of BIHT-URW for various re-weighting iterations. In
(a) the sparsity level is $k = 8$, and in (b) $k = 20$. In both figures, the cyan
(0) line is a result of creating a weight vector out of the desired signal $x$.


III. Conclusion 

We presented several weighting variants of the BIHT
algorithm for one-bit CS when a partial support estimate is
known. We demonstrated that our methods effectively utilize
the support information, but that leveraging support estimates
in an unsupervised fashion is not straightforward. We believe
future work in this area could lead to re-weighted methods
which improve upon existing approaches.


Algorithm 3 Binary Iterative Hard Thresholding with
Unsupervised Re-weighting (BIHT-URW). Given: measurement
matrix $\Phi$, one-bit measurements $y$, sparsity level $k$,
step-size $\tau$, accuracy parameter $\lambda$, number of
re-weighting iterations $\eta$

1: procedure BIHT-URW($\Phi$, $y$, $k$, $\eta$)
2:   $\tilde{T} = \mathrm{supp}(\mathrm{BIHT}(\Phi, y, k))$   ▷ BIHT support estimate
3:   repeat
4:     $\hat{x}$ = BIHT-PSW($\Phi$, $y$, $k$, $\tilde{T}$, $\lambda$)   ▷ Estimate
5:     $\tilde{T} = \mathrm{supp}(\hat{x})$   ▷ Update support estimate
6:   until $\eta$ iterations have completed
7:   return $\hat{x} / \|\hat{x}\|_2$   ▷ Normalize
8: end procedure
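The re-weighting loop of Algorithm 3 can be sketched as below. To keep the example self-contained it repeats a compact weighted-BIHT routine and obtains plain BIHT by calling that routine with all weights equal to 1 (an empty support estimate and accuracy 0). The names `biht_psw`, `biht_urw`, `lam`, and `n_reweight`, as well as the halting rule, are assumptions of this sketch rather than details from the original work.

```python
import numpy as np

def biht_psw(Phi, y, k, support_est, p, tau=0.001, max_iter=1000, tol=1e-10):
    """Weighted BIHT sketch; with p = 0 every weight is 1 and this is plain BIHT."""
    n = Phi.shape[1]
    w = np.full(n, 1.0 - p)
    w[np.asarray(list(support_est), dtype=int)] = 1.0
    x = np.zeros(n)
    for _ in range(max_iter):
        gamma = x + 0.5 * tau * Phi.T @ (y - np.sign(Phi @ x))
        omega = np.argsort(np.abs(gamma * w))[-k:]
        x_new = np.zeros(n)
        x_new[omega] = gamma[omega]
        converged = np.linalg.norm(x_new - x) < tol
        x = x_new
        if converged:
            break
    norm = np.linalg.norm(x)
    return x / norm if norm > 0 else x

def biht_urw(Phi, y, k, lam=0.5, n_reweight=3):
    """Unsupervised re-weighting: reuse the support of each estimate as the next prior."""
    x_hat = biht_psw(Phi, y, k, [], 0.0)          # initial plain BIHT estimate
    for _ in range(n_reweight):
        support_est = np.flatnonzero(x_hat)       # support of the current estimate
        x_hat = biht_psw(Phi, y, k, support_est, lam)
    return x_hat                                  # already unit norm

# Illustrative run on synthetic data.
rng = np.random.default_rng(0)
n, m, k = 256, 512, 8
x_true = np.zeros(n)
x_true[rng.choice(n, size=k, replace=False)] = rng.standard_normal(k)
x_true /= np.linalg.norm(x_true)
Phi = rng.standard_normal((m, n))
y = np.sign(Phi @ x_true)

x_hat = biht_urw(Phi, y, k, lam=0.5, n_reweight=2)
```

Replacing the fixed `lam` with a value that changes across re-weighting iterations would be one way to explore the adaptive choice of $\lambda$ suggested in the text.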